For this script we import the tidied dataframe generated by the Data Wrangling script, which was exported to your output folder.
Our data analyses start by looking into the variability of water depth along the thalweg. Remember that we obtained depth measurements at standard intervals, and also at intermediate locations if we encountered additional pool outlets and pool maximum depths. Although the intermediate measurements will be important for our calculations of residual pool depths (see file Longitudinal Survey 3 Residual Pools), for now we will just use the standard interval measurements to get estimates of reach water depth variability.
We will create a simple plot of water depth by distance along the reach for each site and each year, and will export this plot to your output folder. Note that water depth is distinct from bed elevation: in reality, neither the bed nor the water surface is horizontal over the course of a reach (they slope down).
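A minimal sketch of such a plot, assuming the tidied dataframe contains `Site`, `Year`, `Distance_m`, and `Thalweg_Depth_m` columns (the distance column name is an assumption; check your own data). Toy data stand in for the imported dataframe:

```r
library(ggplot2)

# Toy data standing in for the tidied dataframe; Distance_m is an assumed
# column name for position along the thalweg.
set.seed(1)
standard.df <- data.frame(
  Site = rep(c("Restored", "Reference"), each = 20),
  Year = 2024,
  Distance_m = rep(0:19, times = 2),
  Thalweg_Depth_m = runif(40, 0.1, 0.8)
)

# One panel per site-year combination; depth plotted against distance.
depth_plot <- ggplot(standard.df, aes(x = Distance_m, y = Thalweg_Depth_m)) +
  geom_line() +
  geom_point() +
  facet_grid(Site ~ Year) +
  labs(x = "Distance along reach (m)", y = "Thalweg water depth (m)") +
  theme_bw()

# To export: ggsave("your_output_folder/depth_by_distance.png", depth_plot)
```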
Before proceeding with this section, think again about your hypotheses and predictions: What restoration action did you do, and how do you expect it to impact the variability of water depth? What was the mechanism and in what direction will variability change? It is essential that you consider and clarify these expectations now, because our brains are fantastic at post-hoc justifications.
The plots above should allow some insight into differences in depth variation among your reaches and/or years.
Depending on your choice of reaches or other factors that can come into play, the distance over which each survey was conducted may not match perfectly, for example if you wanted to capture a certain feature, or if the best reference reach was wider than your restored reach. You will have to consider whether these differences could impact depth variability independently of your restoration actions. If your reaches are otherwise well-matched but differ slightly in width (and therefore distance surveyed), then variability metrics are likely still valid as long as the number of depth measurements is the same. If you measured a different number of standard intervals at different locations or in different years, we will need to create subsets of the surveys with equal numbers of depth measurements before proceeding. We will do this now. If your surveys are of equal length, this section will have no effect.
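A base-R sketch of this subsetting step, assuming rows within each survey are ordered from the downstream end; column names and toy data are illustrative:

```r
# Toy data: site A was surveyed over 12 standard intervals, site B over 10.
set.seed(2)
standard.df <- data.frame(
  Site = rep(c("A", "B"), times = c(12, 10)),
  Year = 2024,
  Thalweg_Depth_m = runif(22, 0.1, 0.8)
)

# Smallest number of standard-interval measurements in any site-year survey.
n_min <- min(table(interaction(standard.df$Site, standard.df$Year, drop = TRUE)))

# Keep the first n_min rows of each survey (the downstream end), so every
# survey contributes an equal number of depth measurements.
standard.df_subsampled <- do.call(rbind, lapply(
  split(standard.df, list(standard.df$Site, standard.df$Year), drop = TRUE),
  function(d) head(d, n_min)
))
```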
For those surveys that included more intervals, we select the subset beginning from the start (downstream) end. This might not correspond to a particular feature of interest (the reason you took more measurements). If you instead want to subset a particular area, the simplest approach is to manually create a copy of your raw data .csv with only your chosen subset of intervals, and then return to the Data Wrangling step. However, if you do this, ensure you have a valid reason to choose that particular subset; otherwise this is a very effective way to introduce bias. Note that while you must use matching intervals for this script, it is not deemed necessary for the other scripts in this package (the other scripts are based on the full surveys). As such, if you are manually subsetting your raw data, consider which subset you upload into each script.
Before proceeding with the variability metrics, take another look at the plots above. Unlike bed elevation data, with water depths we cannot immediately see which features are pools and where the pool outlets are. This is because, without local water slope information, we cannot distinguish how far a particular pool outlet would backwater upstream. Fortunately, we can estimate this using a value of the water surface slope, and we will do this when we focus on residual pools. For now, we will examine the heterogeneity of depth, which itself can be an important metric of a reach’s condition.
Water depth is of course sensitive to flow conditions during your survey, but variability in water depth is considered a flow-independent index of complexity (e.g., Kaufmann et al. 1999). We recommend withholding interpretation of absolute depth values until we consider the (flow-independent) residual pool depths in Longitudinal Survey 3 Residual Pools.
Below we generate the following three metrics for comparing thalweg water depth data:
Standard deviation (SD): SD can be interpreted in terms of the absolute variability of the data. It is expressed in metres and is probably the most familiar variability metric. We expect SD to be the most suitable variability metric when comparing the same type of habitat measured at similar spatial scales, for example when comparing changes over time at one location, or comparing well-matched reference/control reaches. We recommend using this metric unless you have good reason to do otherwise.
Coefficient of variation (CV): CV is normalised by dividing SD by the mean, giving a % variability. It is slightly more abstract but may be useful in comparing among distinct habitats / times, where the difference in overall depth is of significance. For example, if we assume the variability of depth is equal in two streams, but one was 0.2 m deep on average, while the other was 1 m deep, then the SD can be the same but the CV would differ.
Mean square error (MSE) of a linear regression of depth on distance: This approach has been used to explore variation in stream reaches where elevation data (rather than water depth data) is collected (e.g., Mossop and Bradford 2006). It is more sensitive to outliers (very deep / shallow values), and it reflects how much the observed thalweg variability differs from a straight line that is angled to fit the data. Because we have collected depth data, our data does not include an inherent slope that needs to be ‘controlled’ by fitting a regression line. However, we still include this for potential comparisons with elevation-derived data (though be very cautious if comparing among data collected in different ways).
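All three metrics can be computed for a single survey with base R alone; the toy depth vector below is illustrative:

```r
# Toy depths (m) at standard intervals for one site-year survey.
depth <- c(0.25, 0.40, 0.18, 0.55, 0.30, 0.22, 0.48, 0.35)
distance <- seq_along(depth)  # interval positions along the reach

sd_depth <- sd(depth)                      # absolute variability, in metres
cv_depth <- sd(depth) / mean(depth) * 100  # variability as a % of the mean

# MSE of a linear regression of depth on distance: residual variability
# around a straight line fitted through the data.
fit <- lm(depth ~ distance)
mse_depth <- mean(residuals(fit)^2)
```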
There are many other metrics that can be applied to this data, each with their own advantages and disadvantages (e.g., Loess smoothing, fractal dimensions, wiggliness, see Bartley and Rutherfurd 2005 for a useful comparison). If a particular metric is common for your study system or species-of-interest, or there are demonstrated instances of its relationship to your ultimate restoration goal, it is probably worthwhile considering it. Although use of more complex/powerful/recent metrics can be tempting and should be encouraged, we do caution against the potential to ‘try out’ a number of different metrics, which can result in selecting the particular metrics which confirm your existing assumptions.
Since we want to keep things accessible and broadly applicable, we recommend the use of standard deviation of water depths as a metric of overall reach depth heterogeneity. It is always useful to spend time examining the visualisation of the data, and interpreting your metrics not in isolation, but alongside the other metrics we will generate later. For example, two streams with similar standard deviations might be distinguished based on an increased depth of pools, or, two streams with equally deep pools might have different standard deviations due to variability of shallow habitats.
The table below summarises variability by site and year, and is exported to your output folder.
The simple plot below visually contrasts the mean and standard deviation for multiple years and/or multiple sites (where available). Remember that, since water depth is flow dependent, focus primarily on interpreting the standard deviation (the bar magnitude) rather than the mean (the vertical location of the bar/mean).
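A base-graphics sketch of this kind of plot, with toy data standing in for your surveys:

```r
# Toy data standing in for the subsampled survey dataframe.
set.seed(4)
standard.df_subsampled <- data.frame(
  Site = rep(c("Restored", "Reference"), each = 20),
  Thalweg_Depth_m = runif(40, 0.1, 0.8)
)

# Mean and SD of depth per site.
depth_summary <- aggregate(Thalweg_Depth_m ~ Site,
                           data = standard.df_subsampled,
                           FUN = function(x) c(mean = mean(x), sd = sd(x)))
means <- depth_summary$Thalweg_Depth_m[, "mean"]
sds   <- depth_summary$Thalweg_Depth_m[, "sd"]

# Bars show the mean; whiskers show +/- 1 SD (the quantity to focus on).
mids <- barplot(means, names.arg = depth_summary$Site,
                ylim = c(0, max(means + sds) * 1.1),
                ylab = "Thalweg water depth (m)")
arrows(mids, means - sds, mids, means + sds,
       angle = 90, code = 3, length = 0.05)
```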
We now ask whether the differences in water depth variability among sites and/or across time are statistically significant. Remember, we are focusing on variability and not absolute values here (due to potentially differing flow conditions). Because we are interested primarily in how our restoration action impacted the reach, we need to define the periods ‘before’ and ‘after’ restoration.
Enter your before and after years in the code below; you may have more than one year in each category. These categories should be balanced (an equal number of years before and after), but they need not be consecutive years. If you do not have balanced data (e.g., only one year before project implementation, but three years after), be very cautious in running and interpreting an ANOVA. You could select only a subset of your data to keep the comparison balanced, but if you do this for more than one set of dates you must account for the multiple comparisons inflating the likelihood of a Type I error (false positive).
# Define "Before" and "After" years
before_years <- c(2024) #add more years inside brackets if needed, e.g. 'c(2024, 2025, 2026)'
after_years <- c(2025) #add more years inside brackets if needed, e.g. 'c(2027, 2029, 2036)'
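The chosen years can then be used to tag each measurement as "Before" or "After"; a sketch, assuming a `Year` column and using toy data:

```r
before_years <- c(2024)
after_years <- c(2025)

# Toy data standing in for the subsampled survey dataframe.
standard.df_subsampled <- data.frame(Year = c(2024, 2024, 2025, 2025))

# Years outside both vectors would become "After" here; drop such rows first
# if your data include years you do not want in the comparison.
standard.df_subsampled$Time_Category <- factor(
  ifelse(standard.df_subsampled$Year %in% before_years, "Before", "After"),
  levels = c("Before", "After")
)
```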
Now that we have established the comparison groups, the next step is to confirm that we are using the appropriate statistical test. You may have to transform your data or consider alternative tests if the distribution of your data does not meet the assumptions of the statistical tests we use.
Because we are interested in comparing variability, certain tests that are common among ecologists may be unsuitable (e.g., ANOVA assumes equal variances across groups). We will instead use a variant of Levene’s test, called the Brown-Forsythe test, because this test compares the variance of groups (regardless of the absolute values of the averages) and is robust to non-normality and the presence of outliers (because it uses the median, not the mean).
Although the Brown-Forsythe test is relatively robust, we should still take a look at the data distribution. This next section generates tests and figures that you should review to ensure your data are suitable for an ANOVA. In brief, look for the following: small p-values from the normality tests (suggesting departure from normality), and skewness values far from 0 or kurtosis values far from 3 (their expected values for a normal distribution).
## # A tibble: 4 × 3
## Site Time_Category p_value
## <chr> <fct> <dbl>
## 1 Brousseau Channel Before 0.00281
## 2 Brousseau Channel After 0.00497
## 3 Hatchery Channel Before 0.0145
## 4 Hatchery Channel After 0.102
## # A tibble: 4 × 4
## Site Time_Category Skewness Kurtosis
## <chr> <fct> <dbl> <dbl>
## 1 Brousseau Channel Before 0.292 2.39
## 2 Brousseau Channel After 0.295 2.23
## 3 Hatchery Channel Before -0.0162 2.23
## 4 Hatchery Channel After 0.0187 2.30
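Diagnostics like those above can be reproduced with base R; the skewness and kurtosis functions below use one common moment-based definition (for a normal distribution, skewness is 0 and kurtosis is 3), and the toy data are illustrative:

```r
# Toy depths standing in for one site/time group's measurements.
set.seed(7)
depths <- rnorm(50, mean = 0.4, sd = 0.15)

# Moment-based skewness and kurtosis (normal: 0 and 3, respectively).
skewness <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5
kurtosis <- function(x) mean((x - mean(x))^4) / mean((x - mean(x))^2)^2

shapiro_p <- shapiro.test(depths)$p.value  # Shapiro-Wilk normality test

c(p_value = shapiro_p,
  Skewness = skewness(depths),
  Kurtosis = kurtosis(depths))
```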
It can also be useful to examine scatterplots of your data. If you notice any particular trends (e.g., decreasing depth with distance upstream), consider what this might mean for your interpretation. What observations did you make in the field that might explain this? Unexpected trends may need further consideration, so that you can understand whether they might cause differences in the statistical results among reaches/years.
It is quite likely that your data are non-normally distributed and feature some skewness and kurtosis. It is difficult to assign a cut-off value at which to recommend transformations (e.g., log or square root), but since this is a relatively robust test, if nothing jumps out at you as highly irregular, we will continue as is.
Below we run the tests for site, time, and the site-time interaction, and also export these to your output folder. The test for Site only (or time only) considers whether variability observed at one site (time) differs significantly from variability at other sites (time). The null hypothesis is that variability is consistent across all sites (times), while the alternative hypothesis is that variability in depth differs for at least one site (time). If your p value is <0.05 (or your chosen risk tolerance) then the variance does differ significantly.
The test for the Site and Year interaction considers whether variability changes observed at one site are significantly different from variability changes observed at other sites (e.g., did our restoration site variability change differently to the control, consistent with us having an impact?). The null hypothesis is that variability is consistent across all site-year combinations, while the alternative hypothesis is that variability in depth differs for at least one site-year combo. If the returned p value is <0.05 (or your chosen risk tolerance) then the variance does differ significantly.
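These tests can be run along the following lines. This base-R sketch computes the Brown-Forsythe statistic directly, as an ANOVA on absolute deviations from each group's median; `car::leveneTest(..., center = median)` gives the same result. The toy data are illustrative:

```r
# Toy data standing in for the subsampled surveys.
set.seed(42)
standard.df_subsampled <- data.frame(
  Site = rep(c("Brousseau Channel", "Hatchery Channel"), each = 30),
  Time_Category = factor(rep(rep(c("Before", "After"), each = 15), times = 2)),
  Thalweg_Depth_m = c(runif(30, 0.1, 0.5), runif(30, 0.1, 0.9))
)

# Brown-Forsythe: ANOVA on absolute deviations from each group's median.
brown_forsythe <- function(y, group) {
  group <- droplevels(as.factor(group))
  dev <- abs(y - ave(y, group, FUN = median))
  anova(lm(dev ~ group))
}

bf_site <- brown_forsythe(standard.df_subsampled$Thalweg_Depth_m,
                          standard.df_subsampled$Site)
bf_time <- brown_forsythe(standard.df_subsampled$Thalweg_Depth_m,
                          standard.df_subsampled$Time_Category)
bf_interaction <- brown_forsythe(
  standard.df_subsampled$Thalweg_Depth_m,
  interaction(standard.df_subsampled$Site,
              standard.df_subsampled$Time_Category)
)
print(bf_site)
```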
##
## Brown-Forsythe Test for Site:
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 5.5245 0.01909 *
## 574
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Brown-Forsythe Test for Time Category:
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 69.713 5.152e-16 ***
## 574
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Brown-Forsythe Test for Site*Time Interaction:
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 3 74.916 < 2.2e-16 ***
## 572
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
So by now we either have some differences in depth variability that appear not to be due to chance, or we have no such detectable differences. Hopefully, any differences are relatively clear from your data visualisations above (e.g., the box-plot). However, if you are unsure which site/time the statistically significant difference is attributable to (e.g., if you had multiple control/reference sites), you can run some post-hoc tests as below.
pairwise_test <- pairwise.t.test(
x = standard.df_subsampled$Thalweg_Depth_m,
g = standard.df_subsampled$Site,
p.adjust.method = "bonferroni"
)
print(pairwise_test)
##
## Pairwise comparisons using t tests with pooled SD
##
## data: standard.df_subsampled$Thalweg_Depth_m and standard.df_subsampled$Site
##
## Brousseau Channel
## Hatchery Channel <2e-16
##
## P value adjustment method: bonferroni
The above compares each pair of sites to see which pairs differ (a significant p value indicates that the pair differs).
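Note that `pairwise.t.test` on the raw depths compares group means. Because our question concerns variability, a post-hoc more consistent with the Brown-Forsythe logic is to run the same pairwise tests on the absolute deviations from each group's median; a sketch with toy data:

```r
# Toy data with three sites standing in for the subsampled surveys.
set.seed(3)
standard.df_subsampled <- data.frame(
  Site = rep(c("Brousseau Channel", "Hatchery Channel", "Reference"),
             each = 25),
  Thalweg_Depth_m = c(runif(25, 0.1, 0.4),
                      runif(25, 0.1, 0.9),
                      runif(25, 0.2, 0.6))
)

# Absolute deviations from each site's median depth: the quantity the
# Brown-Forsythe test is based on.
abs_dev <- with(standard.df_subsampled,
                abs(Thalweg_Depth_m - ave(Thalweg_Depth_m, Site, FUN = median)))

# Pairwise comparisons of these deviations, i.e. of variability among sites.
pairwise_var_test <- pairwise.t.test(
  x = abs_dev,
  g = standard.df_subsampled$Site,
  p.adjust.method = "bonferroni"
)
print(pairwise_var_test)
```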
Once you have considered what the variability in water depth means for your restoration project, we can move on to Longitudinal Survey 3 Residual Pools to look in depth at the quantity and quality of pool habitat.
Bartley, R. and Rutherfurd, I., 2005. Measuring the reach‐scale geomorphic diversity of streams: Application to a stream disturbed by a sediment slug. River Research and Applications, 21(1), pp.39-59.
Kaufmann, P.R., Levine, P., Peck, D.V., Robison, E.G. and Seeliger, C., 1999. Quantifying physical habitat in wadeable streams (p. 149). USEPA [National Health and Environmental Effects Research Laboratory, Western Ecology Division].
Mossop, B. and Bradford, M.J., 2006. Using thalweg profiling to assess and monitor juvenile salmon (Oncorhynchus spp.) habitat in small streams. Canadian Journal of Fisheries and Aquatic Sciences, 63(7), pp.1515-1525.